max rank | avg. rank | sentence |
---|---|---|
325 | 95.1538 | It really was a very different experience said one of the international students. |
328 | 144.1250 | To study more about business in another country. |
340 | 126.0000 | In this case, this product may not start. |
351 | 192.0000 | And most … This week was international week. |
362 | 99.8000 | The health who has in often of found they for. |
405 | 191.0000 | He might know something about our project. |
433 | 129.1250 | You can find contact information on our site. |
450 | 171.5556 | In case of problems you should search here first. |
450 | 186.2500 | Please contact us if you have any problems. |
453 | 136.2000 | The training will be for a period of three days. |
455 | 124.0000 | The standard is the main with the business. |
468 | 280.4286 | The application period is now open until. |
469 | 162.5833 | And you have to plan what to do each week. |
508 | 152.2308 | In the end, this only results in a few minutes each day. |
521 | 116.2500 | How are you different from them, and why do you show you are different from them? |
523 | 102.5000 | If you want something to use for 5+ hours at a time then these are not for you. |
539 | 114.0667 | And, it is your right to know about what they are doing with your money. |
552 | 150.4000 | I really like all of the points you made. |
564 | 157.3846 | And what could be better than doing this in your new home? |
567 | 183.3333 | This week has been a special week at school. |
574 | 236.5000 | It always needs to * have it's own method. |
577 | 167.7500 | You don't need to learn more about your products and services. |
587 | 202.4000 | She can find the best design and size online. |
591 | 143.5455 | A professional web design will get this right, the first time. |
599 | 209.5556 | This should be if a global function is called. |
604 | 364.1250 | A digital personal network levels the learning field. |
626 | 274.4286 | The approach to various levels of education. |
626 | 174.8889 | We should take the same approach in design. |
631 | 143.1000 | There is also a strong support from the Thai government. |
635 | 274.3750 | They could ensure for your family excellent result. |
The maximum word rank of a sentence is, by definition, the rank of the rarest word in the sentence. If it is low, every word in the sentence is a high-frequency word. For this reason, the table of sentences with the lowest maximum word rank may be of interest. The table lists these sentences, restricted to sentences longer than 40 characters.
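The definition can be sketched in a few lines of Python. The function name `sentence_rank_stats` and the toy rank values are assumptions made for illustration; they are not taken from the corpus.

```python
def sentence_rank_stats(tokens, rank):
    """Return (maximum rank, average rank) over the tokens found in `rank`.

    `rank` maps a word to its frequency rank (1 = most frequent word).
    Tokens without a known rank are skipped; None is returned if no
    token has a rank.
    """
    ranks = [rank[t] for t in tokens if t in rank]
    if not ranks:
        return None
    return max(ranks), sum(ranks) / len(ranks)


# Toy ranks, invented for illustration only.
toy_rank = {"the": 1, "of": 2, "week": 350, "school": 420, "special": 567}

tokens = ["the", "special", "week", "of", "school"]
stats = sentence_rank_stats(tokens, toy_rank)
print(stats)
```

The maximum is determined by the single rarest word ("special" in the toy data), while the average also reflects the frequent words.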
The overall distribution of the maximum rank across all sentences of the corpus is shown in a diagram with a log-scaled x-axis.
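One way to prepare counts for a log-scaled axis is to use bins whose width doubles at each step. The sketch below is only an illustration of that idea: the power-of-two binning and the helper name `log2_bin` are assumptions, and the input values are a handful of maximum ranks copied from the table above rather than the full corpus.

```python
from collections import Counter


def log2_bin(rank):
    """Lower bound of the power-of-two bin that contains a positive integer rank."""
    return 1 << (rank.bit_length() - 1)


# A few maximum ranks taken from the table above.
max_ranks = [325, 405, 508, 521, 635]

histogram = Counter(log2_bin(r) for r in max_ranks)
for lower in sorted(histogram):
    print(f"{lower}-{2 * lower - 1}: {histogram[lower]}")
```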
The sentences in the table described above are of interest because they are usually easy to understand. The distribution may give insights into the corpus and may give parameters for language comparison.
While the distribution can be estimated even from a small corpus, sentences like those in the table are rare, so a large corpus will yield more impressive examples.
Table data:
select max(w_id)-100 as m, avg(w_id)-100 as a, s.sentence from sentences s, inv_w i where s.s_id=i.s_id and length(sentence)>40 and i.w_id>100 group by s.s_id order by m limit 30;
Distribution data:
select m, count(*) from (select 100* round((max(w_id)-100)/100) as m from sentences s, inv_w i where s.s_id=i.s_id and i.w_id>100 group by s.s_id) aa group by m;
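The distribution query can be tried on a toy database. The following sketch uses Python's built-in `sqlite3` module; the table layout is an assumption inferred from the column names in the queries, the rows are invented, and the database engine used for the original queries is not stated. SQLite truncates integer division, so the sketch divides by `100.0` on the assumption that `round()` is meant to see a fractional value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed minimal schema: one row per sentence, one row per (word, sentence) pair.
conn.executescript("""
    CREATE TABLE sentences (s_id INTEGER PRIMARY KEY, sentence TEXT);
    CREATE TABLE inv_w (w_id INTEGER, s_id INTEGER);
""")

# Invented toy rows; word IDs up to 100 are excluded by the query.
conn.executemany(
    "INSERT INTO sentences VALUES (?, ?)",
    [(1, "first toy sentence"), (2, "second toy sentence"), (3, "third toy sentence")],
)
conn.executemany(
    "INSERT INTO inv_w VALUES (?, ?)",
    [(101, 1), (150, 1), (425, 1),   # maximum rank 325
     (120, 2), (540, 2),             # maximum rank 440
     (50, 3), (430, 3)],             # w_id 50 is ignored; maximum rank 330
)

# Same structure as the distribution query, with 100.0 to avoid integer division.
distribution = dict(conn.execute("""
    SELECT m, count(*) FROM (
        SELECT 100 * round((max(i.w_id) - 100) / 100.0) AS m
        FROM sentences s, inv_w i
        WHERE s.s_id = i.s_id AND i.w_id > 100
        GROUP BY s.s_id
    ) aa
    GROUP BY m
"""))
print(distribution)
```

Each key is a maximum rank rounded to the nearest hundred, and each value is the number of sentences that fall into that bin.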
Explain the distribution, especially the increase in its right part.
4.5.2.2 Average word rank in sentence
4.5.2.3 Sentences consisting of many low frequency words I
4.5.2.4 Sentences consisting of many low frequency words II
4.5.2.5 Sentences consisting of short words only I
4.5.2.6 Sentences consisting of short words only II
4.5.2.7 Sentences consisting of long words only I
4.5.2.8 Sentences consisting of long words only II